Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions.

Authors

  • Jianfen Ma
  • Yi Hu
  • Philipos C Loizou
Abstract

The articulation index (AI), speech-transmission index (STI), and coherence-based intelligibility metrics have been evaluated primarily in steady-state noisy conditions and have not been tested extensively in fluctuating noise conditions. The aim of the present work is to evaluate the performance of new speech-based STI measures, modified coherence-based measures, and AI-based measures operating on short-term (30 ms) intervals in realistic noisy conditions. Much emphasis is placed on the design of new band-importance weighting functions which can be used in situations wherein speech is corrupted by fluctuating maskers. The proposed measures were evaluated with intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech (consonants and sentences) corrupted by four different maskers (car, babble, train, and street interferences). Of all the measures considered, the modified coherence-based measures and speech-based STI measures incorporating signal-specific band-importance functions yielded the highest correlations (r=0.89-0.94). The modified coherence measure, in particular, that only included vowel/consonant transitions and weak consonant information yielded the highest correlation (r=0.94) with sentence recognition scores. The results from this study clearly suggest that the traditional AI and STI indices could benefit from the use of the proposed signal- and segment-dependent band-importance functions.
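As a rough illustration of the kind of short-term, AI-style measure the abstract describes, the sketch below computes a band-importance-weighted SNR index over frames. It is not the authors' implementation. The function name, the ±15 dB clipping range, and the example weights are assumptions made for illustration only; the abstract specifies none of them beyond the use of short-term (30 ms) intervals and band-importance weighting.

```python
import math

def band_weighted_snr_index(clean_frames, noise_frames, weights, snr_range_db=15.0):
    """Hypothetical AI-style index in [0, 1].

    clean_frames, noise_frames: lists of frames (e.g. 30 ms each), where each
        frame is a list of per-band powers for speech and masker respectively.
    weights: per-band importance weights (illustrative values, not the paper's).
    snr_range_db: assumed clipping range; band SNRs are limited to +/- this value.
    """
    total_weight = sum(weights)
    frame_scores = []
    for clean, noise in zip(clean_frames, noise_frames):
        score = 0.0
        for s, n, w in zip(clean, noise, weights):
            # Band SNR in dB, guarded against zero power.
            snr_db = 10.0 * math.log10(max(s, 1e-12) / max(n, 1e-12))
            # Clip to the assumed range, then map linearly to [0, 1].
            snr_db = max(-snr_range_db, min(snr_range_db, snr_db))
            score += w * (snr_db + snr_range_db) / (2.0 * snr_range_db)
        frame_scores.append(score / total_weight)
    # Average the per-frame scores over the whole utterance.
    return sum(frame_scores) / len(frame_scores)

# One frame, two bands: the first band is clearly audible, the second is masked.
index = band_weighted_snr_index([[1000.0, 0.001]], [[1.0, 1.0]], [0.75, 0.25])
```

A signal- or segment-dependent variant, in the spirit of the abstract, would recompute `weights` per signal or per frame rather than keeping them fixed. How the paper derives those weights is not stated in the abstract.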

Similar resources

SNR loss: A new objective measure for predicting the intelligibility of noise-suppressed speech

Most of the existing intelligibility measures do not account for the distortions present in processed speech, such as those introduced by speech-enhancement algorithms. In the present study, we propose three new objective measures that can be used for prediction of intelligibility of processed (e.g., via an enhancement algorithm) speech in noisy conditions. All three measures use a critical-ban...

Speech Intelligibility Improvement in Noisy Environments for Near-End Listening Enhancement

A new speech intelligibility improvement method for near-end listening enhancement in noisy environments is proposed. This method improves speech intelligibility by optimizing energy correlation of one-third octave bands of clean speech and enhanced noisy speech without power increasing. The energy correlation is determined as a cost function based on frequency band gains of the clean speech. I...

Prediction of intelligibility of noisy and time-frequency weighted speech based on mutual information between amplitude envelopes

This paper deals with the problem of predicting the average intelligibility of noisy and potentially processed speech signals, as observed by a group of normal hearing listeners. We propose a prediction model based on the hypothesis that intelligibility is monotonically related to the amount of Shannon information the critical-band amplitude envelopes of the noisy/processed signal convey ab...

Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese

In this paper, eight state-of-the-art objective speech intelligibility prediction measures are comparatively investigated for noisy signals before and after noise-reduction processing between Mandarin and Japanese. Clean speech signals (Chinese words and Japanese words) were first corrupted by three types of noise at two signal-to-noise ratios and then processed by normal-hearing listeners for ...

Predicting the intelligibility of vocoded speech.

OBJECTIVES The purpose of this study is to evaluate the performance of a number of speech intelligibility indices in terms of predicting the intelligibility of vocoded speech. DESIGN Noise-corrupted sentences were vocoded in a total of 80 conditions, involving three different signal-to-noise ratio levels (-5, 0, and 5 dB) and two types of maskers (steady state noise and two-talker). Tone-voco...

Journal:
  • The Journal of the Acoustical Society of America

Volume 125, Issue 5

Pages: -

Publication date: 2009